Update CONTRIBOP.MD FP16 support for MatMulIntegerToFloat #18751

raoanag · 2023-12-07T22:11:15Z

Description

Motivation and Context

fdwr · 2023-12-08T02:15:53Z

docs/OperatorKernels.md

 |MultiHeadAttention|*in* query:**T**<br> *in* key:**T**<br> *in* value:**T**<br> *in* bias:**T**<br> *in* key_padding_mask:**M**<br> *in* relative_position_bias:**T**<br> *in* past_key:**T**<br> *in* past_value:**T**<br> *out* output:**T**<br> *out* present_key:**T**<br> *out* present_value:**T**|1+|**M** = tensor(int32)<br/> **T** = tensor(float), tensor(float16)|
 |NhwcConv|*in* X:**T**<br> *in* W:**T**<br> *in* B:**T**<br> *out* Y:**T**|1+|**T** = tensor(float), tensor(float16)|
+|QAttention|*in* input:**T1**<br> *in* weight:**T2**<br> *in* bias:**T3**<br> *in* input_scale:**T3**<br> *in* weight_scale:**T3**<br> *in* mask_index:**T4**<br> *in* input_zero_point:**T1**<br> *in* weight_zero_point:**T2**<br> *in* past:**T3**<br> *out* output:**T3**<br> *out* present:**T3**|1+|**T1** = tensor(int8), tensor(uint8)<br/> **T2** = tensor(int8), tensor(uint8)<br/> **T3** = tensor(float), tensor(float16)<br/> **T4** = tensor(int32)|


Hmm, are these other changes intended? It's supposed to be an update for MatMulIntegerToFloat, but why do I also see QAttention and QLinearConcat?

Both files need to be updated; the kernelDocumentation pipeline stage also verifies the contents of these files.

fdwr · 2023-12-09T01:06:50Z

> Both files need to be updated

I'm not disputing that we need them, but I am seeking to know why we're seeing these other changes. I'll presume we're seeing them because they are stale after a merge from main and sign off.

(Note: if we ever generate this locally and see modifications under other providers, like CUDA, then we should not merge those, since we don't build locally with CUDA.)
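The note above about not merging regenerated rows for providers that were not built locally could be checked mechanically before committing. The following is a minimal sketch, not part of this PR or of the kernelDocumentation pipeline stage. It assumes each provider's table in docs/OperatorKernels.md is introduced by a heading of the form `## Operators implemented by <Provider>`; that heading pattern and the function names (`rows_by_provider`, `changed_providers`, `unexpected_changes`) are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Assumption: every provider section starts with a heading shaped like this.
HEADING = re.compile(r"^## Operators implemented by (\w+)\s*$")


def rows_by_provider(markdown: str) -> dict:
    """Group table rows (lines starting with '|') under the latest provider heading."""
    rows = defaultdict(set)
    provider = None
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            provider = match.group(1)
        elif provider is not None and line.startswith("|"):
            rows[provider].add(line)
    return rows


def changed_providers(old: str, new: str) -> set:
    """Return the providers whose set of table rows differs between two versions."""
    old_rows = rows_by_provider(old)
    new_rows = rows_by_provider(new)
    providers = old_rows.keys() | new_rows.keys()
    return {p for p in providers if old_rows.get(p, set()) != new_rows.get(p, set())}


def unexpected_changes(old: str, new: str, built_locally) -> set:
    """Return changed providers that were not part of the local build."""
    return changed_providers(old, new) - set(built_locally)
```

With the committed file as `old` and the locally regenerated file as `new`, a non-empty result from `unexpected_changes(old, new, {"DmlExecutionProvider"})` would flag rows (for example under CUDA) that should be left out of the commit.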

Update CONTRIBOP.MD FP16 support for MatMulIntegerToFloat

9cb1f76

raoanag requested review from fdwr, jstoecker and zhangxiang1993 December 7, 2023 22:11

fdwr reviewed Dec 8, 2023

fdwr approved these changes Dec 9, 2023

Jamather merged commit 8b8f774 into WindowsAI Dec 13, 2023
46 of 48 checks passed

Jamather deleted the user/anagrao/contribopupdates branch December 13, 2023 21:39
